14 research outputs found

    Graph theoretic generalizations of clique: optimization and extensions

    Get PDF
    This dissertation considers graph theoretic generalizations of the maximum clique problem. Models that were originally proposed in social network analysis literature, are investigated from a mathematical programming perspective for the first time. A social network is usually represented by a graph, and cliques were the first models of "tightly knit groups" in social networks, referred to as cohesive subgroups. Cliques are idealized models and their overly restrictive nature motivated the development of clique relaxations that relax different aspects of a clique. Identifying large cohesive subgroups in social networks has traditionally been used in criminal network analysis to study organized crimes such as terrorism, narcotics and money laundering. More recent applications are in clustering and data mining wireless networks, biological networks as well as graph models of databases and the internet. This research has the potential to impact homeland security, bioinformatics, internet research and telecommunication industry among others. The focus of this dissertation is a degree-based relaxation called k-plex. A distance-based relaxation called k-clique and a diameter-based relaxation called k-club are also investigated in this dissertation. We present the first systematic study of the complexity aspects of these problems and application of mathematical programming techniques in solving them. Graph theoretic properties of the models are identified and used in the development of theory and algorithms. Optimization problems associated with the three models are formulated as binary integer programs and the properties of the associated polytopes are investigated. Facets and valid inequalities are identified based on combinatorial arguments. A branch-and-cut framework is designed and implemented to solve the optimization problems exactly. Specialized preprocessing techniques are developed that, in conjunction with the branch-and-cut algorithm, optimally solve the problems on real-life power law graphs, which is a general class of graphs that include social and biological networks. Computational experiments are performed to study the effectiveness of the proposed solution procedures on benchmark instances and real-life instances. The relationship of these models to the classical maximum clique problem is studied, leading to several interesting observations including a new compact integer programming formulation. We also prove new continuous non-linear formulations for the classical maximum independent set problem which maximize continuous functions over the unit hypercube, and characterize its local and global maxima. Finally, clustering and network design extensions of the clique relaxation models are explored

    Constructing Test Functions for Global Optimization Using Continuous Formulations of Graph Problems

    No full text
    A method for constructing test functions for global optimization which utilizes continuous formulations of combinatorial optimization problems is suggested. In particular, global optimization formulations for the maximum independent set, maximum clique and MAX CUT problems on arbitrary graphs are considered, and proofs for some of them are given. A number of sample test functions based on these formulations are proposed

    Graph Domination, Coloring and Cliques in Telecommunications

    No full text
    This paper aims to provide a detailed survey of existing graph models and algorithms for important problems that arise in di#erent areas of wireless telecommunication. In particular, applications of graph optimization problems such as minimum dominating set, minimum vertex coloring and maximum clique in multihop wireless networks are discussed

    Novel approaches for analyzing biological networks

    No full text
    This paper proposes clique relaxations to identify clusters in biological networks. In particular, the maximum n-clique and maximum n-club problems on an arbitrary graph are introduced and their recognition versions are shown to be NP-complete. In addition, integer programming formulations are proposed and the results of sample numerical experiments performed on biological networks are reported

    An Ellipsoidal Bounding Scheme for the Quasi-Clique Number of a Graph

    No full text

    The Minimum Spanning K-Core Problem With Bounded Cvar Under Probabilistic Edge Failures

    No full text
    This article introduces the minimum spanning k-core problem that seeks to find a spanning subgraph with a minimum degree of at least k (also known as a k-core) that minimizes the total cost of the edges in the subgraph. The concept of k-cores was introduced in social network analysis to identify denser portions of a social network. We exploit the graph-theoretic properties of this model to introduce a new approach to survivable interhub network design via spanning k-cores; the model preserves connectivity and diameter under limited edge failures. The deterministic version of the problem is polynomial-time solvable due to its equivalence to generalized graph matching. We propose two conditional value-at-risk (CVaR) constrained optimization models t obtain risk-averse solutions for the minimum spanning k-core problem under probabilistic edge failures. We present polyhedral reformulations of the convex piecewise linear loss functions used in these models that enable Benders-like decomposition approaches. A decomposition and branch-and-cut approach is then developed to solve the scenario-based approximation of the CVaR-constrained minimum spanning k-core problem for the aforementioned loss functions. The computational performance of the algorithm is investigated via numerical experiments

    SNP variable selection by generalized graph domination.

    No full text
    BACKGROUND:High-throughput sequencing technology has revolutionized both medical and biological research by generating exceedingly large numbers of genetic variants. The resulting datasets share a number of common characteristics that might lead to poor generalization capacity. Concerns include noise accumulated due to the large number of predictors, sparse information regarding the p≫n problem, and overfitting and model mis-identification resulting from spurious collinearity. Additionally, complex correlation patterns are present among variables. As a consequence, reliable variable selection techniques play a pivotal role in predictive analysis, generalization capability, and robustness in clustering, as well as interpretability of the derived models. METHODS AND FINDINGS:K-dominating set, a parameterized graph-theoretic generalization model, was used to model SNP (single nucleotide polymorphism) data as a similarity network and searched for representative SNP variables. In particular, each SNP was represented as a vertex in the graph, (dis)similarity measures such as correlation coefficients or pairwise linkage disequilibrium were estimated to describe the relationship between each pair of SNPs; a pair of vertices are adjacent, i.e. joined by an edge, if the pairwise similarity measure exceeds a user-specified threshold. A minimum k-dominating set in the SNP graph was then made as the smallest subset such that every SNP that is excluded from the subset has at least k neighbors in the selected ones. The strength of k-dominating set selection in identifying independent variables, and in culling representative variables that are highly correlated with others, was demonstrated by a simulated dataset. The advantages of k-dominating set variable selection were also illustrated in two applications: pedigree reconstruction using SNP profiles of 1,372 Douglas-fir trees, and species delineation for 226 grasshopper mouse samples. A C++ source code that implements SNP-SELECT and uses Gurobi optimization solver for the k-dominating set variable selection is available (https://github.com/transgenomicsosu/SNP-SELECT)
    corecore